Abstract

In this paper we present optimization problems with a biconvex objective function
and linear constraints such that the set of global minima of the optimization
problems is the same as the set of Nash equilibria of an n-player general-sum normal
form game. We further show that the objective function is an invex function and
consider a projected gradient descent algorithm. We prove that the projected gradient
descent scheme converges to a partial optimum of the objective function. We
also present simulation results on certain test cases showing convergence to a Nash
equilibrium strategy.


1 Introduction 

A general theory of games, first introduced in [1], has found several applications in the
fields of economics and engineering. A solution concept or notion of equilibrium was
proposed by Nash (known as Nash equilibrium) in [2] and was shown to exist in every
finite normal-form game. Further generalizations of Nash equilibrium such as correlated
equilibrium and coarse correlated equilibrium were also introduced and studied. It is well
known that for every game the sets of correlated and coarse correlated equilibria are convex
subsets of the strategy space, but in general the set of Nash equilibria is not convex. A
number of methods have been proposed to compute a Nash equilibrium strategy: the
Lemke-Howson algorithm for bi-matrix games [3], the global Newton method [4] and
homotopy based methods [5] are a few of them.

For a general n-player game, the associated optimization problem is non-linear and
non-convex and hence is difficult to solve. It is known that the problem of computing Nash
equilibria in bi-matrix games is a linear complementarity problem and for the general
n-player scenario it is a non-linear complementarity problem. Linear complementarity
problems (the ones arising from games) can be solved using the Lemke-Howson method,
while non-linear complementarity problems are in general hard to solve and require
sufficient conditions to be imposed on the problem, which are not satisfied
by every game.

In this paper we present optimization problems with a biconvex objective function and
linear constraints such that the set of global minima of the optimization problems is the
same as the set of Nash equilibria of an n-player general-sum normal form game. Global
optimization algorithms exist that can compute the global minima of such optimization
problems [6]. The main idea in the formulation of these optimization problems is the fact
that a correlated or coarse correlated equilibrium which is a product of the individual players'
strategies is a Nash equilibrium. We further show that the objective function is an invex
function, i.e. the set of stationary points is the same as the set of global minima. We
also consider a projected gradient descent scheme and prove that it converges to a partial
optimum of the objective function.

The remainder of this paper is organised as follows: In section 2, necessary definitions
and notations are stated. In section 3, functions with required properties are defined.
In section 4, properties of the functions defined in section 3 are proved. In section
5, optimization problems are presented. In section 6, the projected gradient descent
algorithm is stated and convergence analysis is performed. In section 7, simulation results
of the projected gradient descent algorithm on certain test cases are presented. In section
8, we summarize and present directions for future research.


2 Definitions and notations. 


In this section we shall state definitions and introduce the variables and notation used later in
this paper.

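A normal form game (or simply a game) $\Gamma$ is defined by the tuple $\Gamma = \langle I, (A^i)_{i \in I}, (u^i)_{i \in I} \rangle$
where $I$ denotes the set of players ($I = \{1, \dots, N\}$) and, $\forall i \in I$, $A^i$ denotes the set of actions
of player $i$ ($A^i = \{a^i_j : 1 \le j \le m_i\}$). Let $A = \times_{i \in I} A^i$ and, $\forall i \in I$, $u^i : A \to \mathbb{R}$ denotes
the utility function of player $i$.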

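For every $i \in I$, $\Sigma^i$ denotes the set of probability distributions on $A^i$. $\Sigma^i$ is identified
with the probability simplex $\Delta^{m_i} \subset \mathbb{R}^{m_i}$, and $\pi^i = (\pi^i(a^i_1), \dots, \pi^i(a^i_{m_i}))$ denotes a generic
element of $\Sigma^i$. Let $\pi = \langle \pi^1, \dots, \pi^N \rangle$, which is identified with a vector in $\times_{i \in I} \Delta^{m_i} \subset \mathbb{R}^{M_1}$
where $M_1 = \sum_{i \in I} m_i$. Let $\Sigma = \times_{i \in I} \Sigma^i$.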

Let $\Sigma_c$ denote the set of probability distributions on $A$. $\Sigma_c$ is identified with the
probability simplex $\Delta^{M_2} \subset \mathbb{R}^{M_2}$ where $M_2 = \prod_{i \in I} m_i$. $p = (p(a) : a \in A)$ denotes a generic
element in $\Sigma_c$.

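For every $i \in I$, $A^{-i} = \times_{\{k \in I, k \ne i\}} A^k$ and $a^{-i}$ denotes a generic element in $A^{-i}$.
Similarly, this can be extended to more than one player. $\forall i \in I$, $\forall a^{-i} = (a^k_{j_k} : k \in I, k \ne i, a^k_{j_k} \in A^k) \in A^{-i}$,
$\forall a^i_j \in A^i$, $(a^i_j, a^{-i}) = (a^1_{j_1}, \dots, a^{i-1}_{j_{i-1}}, a^i_j, a^{i+1}_{j_{i+1}}, \dots, a^N_{j_N}) \in A$.
Similarly define, $\forall i \in I$, $\Sigma^{-i} = \times_{\{k \in I, k \ne i\}} \Sigma^k$ and let $\pi^{-i}$ denote a generic element in
$\Sigma^{-i}$. $\forall i \in I$, $\forall \pi^{-i} = (\pi^k : k \in I, k \ne i, \pi^k \in \Sigma^k) \in \Sigma^{-i}$, $\forall \pi^i \in \Sigma^i$, $(\pi^i, \pi^{-i}) =
(\pi^1, \dots, \pi^{i-1}, \pi^i, \pi^{i+1}, \dots, \pi^N) \in \Sigma$.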

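For every $i \in I$, $u^i(\pi) = \sum_{a \in A} u^i(a) \prod_{k \in I} \pi^k(a^k_{j_k})$ where $a = (a^k_{j_k} : k \in I)$. For every $i \in I$,
$\forall a^i_j \in A^i$, $\forall \pi^{-i} \in \Sigma^{-i}$, $u^i(a^i_j, \pi^{-i}) = \sum_{a^{-i} \in A^{-i}} u^i(a^i_j, a^{-i}) \prod_{k \in I, k \ne i} \pi^k(a^k_{j_k})$ where
$a^{-i} = (a^k_{j_k} : k \in I, k \ne i)$.

For every $i \in I$, $u^i(p) = \sum_{a \in A} u^i(a) p(a)$ and, $\forall a^i_j \in A^i$, $u^i(a^i_j, p^{-i}) = \sum_{a^{-i} \in A^{-i}} u^i(a^i_j, a^{-i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i})$.
Similarly define, $\forall i \in I$, $\forall \pi^i \in \Sigma^i$, $\forall p \in \Sigma_c$, $u^i(\pi^i, p^{-i}) = \sum_{j=1}^{m_i} u^i(a^i_j, p^{-i}) \pi^i(a^i_j)$.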

$\pi \in \Sigma$ is said to be a Nash equilibrium strategy of the game $\Gamma$ (or just N.E.) if
$\forall i \in I$, $\forall a^i_j \in A^i$, $u^i(a^i_j, \pi^{-i}) - u^i(\pi) \le 0$. Let $NE(\Gamma)$ denote the set of Nash equilibrium
strategies of the game $\Gamma$.

$p \in \Sigma_c$ is said to be a correlated equilibrium strategy of the game $\Gamma$ (or just C.E.)
if $\forall i \in I$, $\forall a^i_j, a^i_{j'} \in A^i$, $\sum_{a^{-i} \in A^{-i}} (u^i(a^i_{j'}, a^{-i}) - u^i(a^i_j, a^{-i})) p(a^i_j, a^{-i}) \le 0$. Let $CE(\Gamma)$ denote
the set of correlated equilibria of the game $\Gamma$.

$p \in \Sigma_c$ is said to be a coarse correlated equilibrium strategy of the game $\Gamma$ (or
just C.C.E.) if $\forall i \in I$, $\forall a^i_j \in A^i$, $u^i(a^i_j, p^{-i}) - u^i(p) \le 0$. Let $CCE(\Gamma)$ denote the set of
coarse correlated equilibria of the game $\Gamma$.

Define $P : \Sigma \to \Sigma_c$, s.t., $\forall \pi \in \Sigma$, $\forall a \in A$, $P(\pi)(a) = \prod_{i \in I} \pi^i(a^i_{j_i})$ where $a =
(a^i_{j_i} : i \in I)$. Let the graph of the function $P$ be $\mathcal{G}(P) := \{(\pi, p) \in \Sigma \times \Sigma_c : p = P(\pi)\}$.
In the following lemma we summarize the relationship between the various equilibrium
concepts defined.

Lemma 2.1: Given a game $\Gamma$, the following hold.

(1) $P(NE(\Gamma)) \subseteq CE(\Gamma) \subseteq CCE(\Gamma)$.

(2) If $p \in CE(\Gamma)$ and $\exists \pi \in \Sigma$ s.t. $p = P(\pi)$, then $\pi \in NE(\Gamma)$.

(3) If $p \in CCE(\Gamma)$ and $\exists \pi \in \Sigma$ s.t. $p = P(\pi)$, then $\pi \in NE(\Gamma)$.

The results in the lemma follow directly from the definitions.

$(\pi, p) \in \Sigma \times \Sigma_c$ is a Nash equilibrium profile of the game $\Gamma$ if $\pi$ is a Nash equilibrium
strategy of the game $\Gamma$ and $p = P(\pi)$.
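
The definitions above can be made concrete with a small sketch. The dictionary encoding of a game, the function names and the tolerance are assumptions made for this illustration only; the payoffs are those of the rock-paper-scissor game of section 7.1.

```python
from itertools import product
from math import prod

# Hypothetical encoding: RPS[a] = (u^1(a), u^2(a)) for the action profile a,
# with actions indexed 0 = R, 1 = P, 2 = S (payoffs taken from section 7.1).
RPS = {
    (0, 0): (0, 0), (0, 1): (0, 1), (0, 2): (1, 0),
    (1, 0): (1, 0), (1, 1): (0, 0), (1, 2): (0, 1),
    (2, 0): (0, 1), (2, 1): (1, 0), (2, 2): (0, 0),
}

def product_distribution(pi):
    """P(pi): joint distribution under which the players randomize independently."""
    ranges = [range(len(strategy)) for strategy in pi]
    return {a: prod(pi[i][a[i]] for i in range(len(pi))) for a in product(*ranges)}

def expected_utility(utilities, p, i):
    """u^i(p) = sum over profiles a of u^i(a) p(a)."""
    return sum(utilities[a][i] * p[a] for a in p)

def deviation_utility(utilities, pi, i, j):
    """u^i(a^i_j, pi^{-i}): player i plays action j, the others follow pi."""
    deviated = [list(strategy) for strategy in pi]
    deviated[i] = [1.0 if k == j else 0.0 for k in range(len(pi[i]))]
    return expected_utility(utilities, product_distribution(deviated), i)

def is_nash(utilities, pi, tol=1e-9):
    """Check u^i(a^i_j, pi^{-i}) - u^i(pi) <= 0 for every player i and action j."""
    p = product_distribution(pi)
    return all(
        deviation_utility(utilities, pi, i, j) - expected_utility(utilities, p, i) <= tol
        for i in range(len(pi))
        for j in range(len(pi[i]))
    )
```

For instance, `is_nash(RPS, [[1/3] * 3, [1/3] * 3])` checks the uniform strategy, while a pure profile such as `[[1, 0, 0], [1, 0, 0]]` fails the check because player 1 gains by switching to P.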

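Let $\mathcal{A}_1$ and $\mathcal{A}_2$ be two convex subsets of $\mathbb{R}^{n_1}$ and $\mathbb{R}^{n_2}$ respectively. A function $g :
\mathcal{A}_1 \times \mathcal{A}_2 \to \mathbb{R}$ is said to be a biconvex function if $\forall x \in \mathcal{A}_1$, $g(x, \cdot) : \mathcal{A}_2 \to \mathbb{R}$ is a
convex function and $\forall y \in \mathcal{A}_2$, $g(\cdot, y) : \mathcal{A}_1 \to \mathbb{R}$ is a convex function. $(x^*, y^*) \in \mathcal{A}_1 \times \mathcal{A}_2$
is a partial optimum of a biconvex function $g$ if $\forall x \in \mathcal{A}_1$, $g(x^*, y^*) \le g(x, y^*)$ and
$\forall y \in \mathcal{A}_2$, $g(x^*, y^*) \le g(x^*, y)$. For a detailed study of biconvex functions see [7].

Let $\mathcal{P}$ be a subset of $\mathbb{R}^n$ and $g : \mathcal{P} \to \mathbb{R}$. $x^* \in \mathcal{P}$ is said to be a global optimum of the
optimization problem $\min_x g(x)$, subject to $x \in \mathcal{P}$, if $\forall x \in \mathcal{P}$, $g(x^*) \le g(x)$.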


3 Objective functions. 

In this section we shall define functions whose set of zeros is the same as the set of Nash
equilibria of the game $\Gamma$. The following theorem gives a necessary and sufficient condition
for $(\pi, p) \in \Sigma \times \Sigma_c$ to be in $\mathcal{G}(P)$.


Theorem 3.1: Given $(\pi, p) \in \Sigma \times \Sigma_c$. Then, $(\pi, p) \in \mathcal{G}(P)$ iff $\forall i \in I$, $\forall a \in
A$, $p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) = 0$, where $a = (a^i_{j_i}, a^{-i})$.

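Proof : [$\Rightarrow$] Assume $(\pi, p) \in \mathcal{G}(P)$. Fix $i \in I$, $a \in A$ (where $a = (a^k_{j_k} : k \in I)$).
Then $p(a) = \prod_{k \in I} \pi^k(a^k_{j_k})$ and $\sum_{l=1}^{m_i} p(a^i_l, a^{-i}) = \prod_{k \in I, k \ne i} \pi^k(a^k_{j_k}) \sum_{l=1}^{m_i} \pi^i(a^i_l) = \prod_{k \in I, k \ne i} \pi^k(a^k_{j_k})$.
Therefore $p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) = \prod_{k \in I} \pi^k(a^k_{j_k}) - \prod_{k \in I} \pi^k(a^k_{j_k}) = 0$.
Since $i \in I$, $a \in A$ are arbitrary, $p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) = 0$, $\forall i \in I$ and $\forall a \in A$.

[$\Leftarrow$] Fix $a^* \in A$ where $a^* = (a^i_{j_i^*} : i \in I)$. From the hypothesis, we
know that $\forall a^{-1} \in A^{-1}$, $p(a^1_{j_1^*}, a^{-1}) = \pi^1(a^1_{j_1^*}) \sum_{l_1=1}^{m_1} p(a^1_{l_1}, a^{-1})$. Summing the above over the actions of the second player, we get,
$\forall a^{-1,2} \in A^{-1,2}$, $\sum_{l_2=1}^{m_2} p(a^1_{j_1^*}, a^2_{l_2}, a^{-1,2}) = \pi^1(a^1_{j_1^*}) \sum_{l_1=1}^{m_1} \sum_{l_2=1}^{m_2} p(a^1_{l_1}, a^2_{l_2}, a^{-1,2})$. From the
hypothesis, we also know that $\forall a^{-1,2} \in A^{-1,2}$, $p(a^1_{j_1^*}, a^2_{j_2^*}, a^{-1,2}) = \pi^2(a^2_{j_2^*}) \sum_{l_2=1}^{m_2} p(a^1_{j_1^*}, a^2_{l_2}, a^{-1,2})$.
Therefore by substituting for the sum, we get, $\forall a^{-1,2} \in A^{-1,2}$, $p(a^1_{j_1^*}, a^2_{j_2^*}, a^{-1,2}) =
\pi^2(a^2_{j_2^*}) \pi^1(a^1_{j_1^*}) \sum_{l_1=1}^{m_1} \sum_{l_2=1}^{m_2} p(a^1_{l_1}, a^2_{l_2}, a^{-1,2})$. Similarly, repeating the above procedure for the
actions of the third player we get, $\forall a^{-1,2,3} \in A^{-1,2,3}$, $p(a^1_{j_1^*}, a^2_{j_2^*}, a^3_{j_3^*}, a^{-1,2,3}) =
\pi^3(a^3_{j_3^*}) \pi^2(a^2_{j_2^*}) \pi^1(a^1_{j_1^*}) \sum_{l_1=1}^{m_1} \sum_{l_2=1}^{m_2} \sum_{l_3=1}^{m_3} p(a^1_{l_1}, a^2_{l_2}, a^3_{l_3}, a^{-1,2,3})$. Proceeding all the way
up to player $N$ we get, $p(a^*) = (\prod_{i \in I} \pi^i(a^i_{j_i^*})) (\sum_{l_1=1}^{m_1} \cdots \sum_{l_N=1}^{m_N} p(a^1_{l_1}, \dots, a^N_{l_N}))$. Since $p \in
\Sigma_c$, we know that $\sum_{a \in A} p(a) = \sum_{l_1=1}^{m_1} \cdots \sum_{l_N=1}^{m_N} p(a^1_{l_1}, \dots, a^N_{l_N}) = 1$. Therefore, $p(a^*) = \prod_{i \in I} \pi^i(a^i_{j_i^*})$.
Since $a^* \in A$ is arbitrary, $p = P(\pi)$. ■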


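Using the above theorem we now define a non-negative function on $\Sigma \times \Sigma_c$ such that
the function takes the value zero on $\mathcal{G}(P)$ and is positive on $\mathcal{G}(P)^c$.

Let $f : \Sigma \times \Sigma_c \to [0, \infty)$ be such that, $\forall (\pi, p) \in \Sigma \times \Sigma_c$,

$$f(\pi, p) = \sum_{i \in I} \sum_{a \in A,\ a = (a^i_{j_i}, a^{-i})} \Big( p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) \Big)^2.$$

Corollary 3.1: Given $(\pi, p) \in \Sigma \times \Sigma_c$. Then, $f(\pi, p) = 0$ iff $(\pi, p) \in \mathcal{G}(P)$.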
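
As a numerical illustration of Corollary 3.1, the following sketch evaluates $f$ for a joint distribution stored as a dictionary over action profiles. The encoding and the names `graph_residual` and `product_distribution` are assumptions introduced here, not notation from the paper.

```python
from itertools import product
from math import prod

def product_distribution(pi):
    """P(pi): joint distribution under which the players randomize independently."""
    ranges = [range(len(strategy)) for strategy in pi]
    return {a: prod(pi[i][a[i]] for i in range(len(pi))) for a in product(*ranges)}

def graph_residual(pi, p):
    """f(pi, p): sum over players i and profiles a of
    (p(a) - pi^i(a^i) * sum_l p(a^i_l, a^{-i}))^2.
    p must assign a value (possibly zero) to every action profile."""
    total = 0.0
    for a in p:
        for i, strategy in enumerate(pi):
            # Marginal of p on the other players' actions a^{-i}.
            marginal = sum(p[a[:i] + (l,) + a[i + 1:]] for l in range(len(strategy)))
            total += (p[a] - strategy[a[i]] * marginal) ** 2
    return total

# A product distribution lies on the graph of P, so f vanishes there.
pi = [[0.25, 0.75], [0.5, 0.5]]
f_on_graph = graph_residual(pi, product_distribution(pi))

# A perfectly correlated distribution is not a product, so f is positive.
p_correlated = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
f_off_graph = graph_residual([[0.5, 0.5], [0.5, 0.5]], p_correlated)
```

In this example every one of the eight residual terms off the graph equals $\pm 0.25$, so `f_off_graph` evaluates to 0.5 while `f_on_graph` is zero.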


From the definitions of coarse correlated equilibrium and correlated equilibrium we
now define the following non-negative functions on $\Sigma_c$ such that they take the value zero
on the set of coarse correlated equilibria ($CCE(\Gamma)$) and correlated equilibria ($CE(\Gamma)$)
respectively.

Let $C_1 : \Sigma_c \to [0, \infty)$ be such that, $\forall p \in \Sigma_c$,

$$C_1(p) = \sum_{i \in I} \sum_{j=1}^{m_i} \big( \max\{u^i(a^i_j, p^{-i}) - u^i(p), 0\} \big)^2$$

and $C_2 : \Sigma_c \to [0, \infty)$ be such that, $\forall p \in \Sigma_c$,

$$C_2(p) = \sum_{i \in I} \sum_{j=1}^{m_i} \sum_{j'=1}^{m_i} \Big( \max\Big\{ \sum_{a^{-i} \in A^{-i}} (u^i(a^i_{j'}, a^{-i}) - u^i(a^i_j, a^{-i})) p(a^i_j, a^{-i}), 0 \Big\} \Big)^2.$$

Lemma 3.1: Given $p \in \Sigma_c$.

• $C_1(p) = 0$ iff $p \in CCE(\Gamma)$.

• $C_2(p) = 0$ iff $p \in CE(\Gamma)$.

Proof : Follows directly from the definitions of correlated equilibrium and coarse correlated
equilibrium in section 2. ■

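Let $B : \Sigma \times \Sigma_c \to [0, \infty)$ s.t. $\forall (\pi, p) \in \Sigma \times \Sigma_c$,

$$B(\pi, p) = \sum_{i \in I} \sum_{j=1}^{m_i} \big( \max\{u^i(a^i_j, p^{-i}) - u^i(\pi^i, p^{-i}), 0\} \big)^2.$$

The idea is that when $(\pi, p) \in \mathcal{G}(P)$ and $B(\pi, p) = 0$, then, $\forall i \in I$, $\pi^i$ is
a best response to $\pi^{-i}$.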


Lemma 3.2: Given $(\pi, p) \in \mathcal{G}(P)$. $B(\pi, p) = 0$ iff $\pi$ is a Nash equilibrium.

Proof : [$\Rightarrow$] Since $B(\pi, p) = 0$, we have, $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $\max\{u^i(a^i_j, p^{-i}) -
u^i(\pi^i, p^{-i}), 0\} = 0$. Hence $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $u^i(a^i_j, p^{-i}) - u^i(\pi^i, p^{-i}) \le 0$.
Since $(\pi, p) \in \mathcal{G}(P)$, $u^i(a^i_j, p^{-i}) = u^i(a^i_j, \pi^{-i})$ and $u^i(\pi^i, p^{-i}) = u^i(\pi^i, \pi^{-i})$. Therefore,
$\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $u^i(a^i_j, \pi^{-i}) - u^i(\pi^i, \pi^{-i}) \le 0$, which by the definition of a Nash
equilibrium strategy in section 2 implies that $\pi$ is a Nash equilibrium.

[$\Leftarrow$] Since $\pi$ is a Nash equilibrium, we have, $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $u^i(a^i_j, \pi^{-i}) -
u^i(\pi^i, \pi^{-i}) \le 0$. Since $(\pi, p) \in \mathcal{G}(P)$, $u^i(a^i_j, p^{-i}) = u^i(a^i_j, \pi^{-i})$ and $u^i(\pi^i, p^{-i}) = u^i(\pi^i, \pi^{-i})$.
Therefore, $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $u^i(a^i_j, p^{-i}) - u^i(\pi^i, p^{-i}) \le 0$, which further implies,
$\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $\max\{u^i(a^i_j, p^{-i}) - u^i(\pi^i, p^{-i}), 0\} = 0$. Thus $B(\pi, p) = 0$. ■

We now characterise the set of Nash equilibria of a game $\Gamma$ using the functions
$f$, $B$, $C_1$ and $C_2$.

Theorem 3.2: Given $(\pi, p) \in \Sigma \times \Sigma_c$.

(1) $(\pi, p)$ is a Nash equilibrium profile iff $f(\pi, p) + C_1(p) = 0$.

(2) $(\pi, p)$ is a Nash equilibrium profile iff $f(\pi, p) + C_2(p) = 0$.

(3) $(\pi, p)$ is a Nash equilibrium profile iff $f(\pi, p) + B(\pi, p) = 0$.

Proof : First we shall prove (1). [$\Rightarrow$] Assume $(\pi, p)$ is a Nash equilibrium profile. Then, by
the definition of a Nash equilibrium profile in section 2, $\pi$ is a N.E. and $p = P(\pi)$. By Lemma
2.1, since $\pi$ is a N.E., $P(\pi) = p \in CCE(\Gamma)$, and since $p = P(\pi)$, $(\pi, p) \in \mathcal{G}(P)$. Thus
$f(\pi, p) = 0$ and $C_1(p) = 0$ by Theorem 3.1 and Lemma 3.1 respectively. Therefore
$f(\pi, p) + C_1(p) = 0$.

[$\Leftarrow$] Assume $f(\pi, p) + C_1(p) = 0$. Since both $f$ and $C_1$ are non-negative, $f(\pi, p) = 0$ and
$C_1(p) = 0$. By Theorem 3.1, $f(\pi, p) = 0$ implies $(\pi, p) \in \mathcal{G}(P)$ and by Lemma 3.1
$C_1(p) = 0$ implies $p \in CCE(\Gamma)$. Since $p \in CCE(\Gamma)$ and $p = P(\pi)$, from Lemma 2.1
we have that $\pi$ is a N.E. Thus $(\pi, p)$ is a Nash equilibrium profile.

The proof of (2) is similar to that of (1) and the proof of (3) follows from Lemma 3.2 and
Corollary 3.1. ■


4 Properties of the objective functions. 

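In this section we shall prove certain properties of the functions constructed in section
3. First, we shall prove that $f$ is biconvex and that $C_1$ and $C_2$ are convex.

Lemma 4.1: $f$ is a biconvex function, i.e. $\forall \pi \in \Sigma$, $f(\pi, \cdot) : \Sigma_c \to [0, \infty)$ is convex
and $\forall p \in \Sigma_c$, $f(\cdot, p) : \Sigma \to [0, \infty)$ is convex.

Proof : $\forall i \in I$, $\forall a \in A$ where $a = (a^i_{j_i}, a^{-i})$, $p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i})$ is a
linear function of $p \in \Sigma_c$ and an affine function of $\pi \in \Sigma$. By proposition 1.1.4 in [9],
$(p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}))^2$ is convex in $p \in \Sigma_c$ and in $\pi \in \Sigma$ with the other fixed. Since the
sum of convex functions is convex,

$$f(\pi, p) = \sum_{i \in I} \sum_{a \in A,\ a = (a^i_{j_i}, a^{-i})} \Big( p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) \Big)^2$$

is convex in $p$ for every fixed $\pi \in \Sigma$ and is convex in $\pi$ for every fixed $p \in \Sigma_c$. ■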

Lemma 4.2: $C_1$ and $C_2$ are convex functions of $p \in \Sigma_c$.

Proof : First we shall show that $C_1$ is convex. $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $u^i(a^i_j, p^{-i}) - u^i(p)$ is
linear in $p \in \Sigma_c$. Since the supremum of convex functions is convex, $\forall i \in I$, $\forall j \in
\{1, \dots, m_i\}$, $\max\{u^i(a^i_j, p^{-i}) - u^i(p), 0\}$ is convex and non-negative. Since the composition of a convex
nondecreasing function (here $x \mapsto x^2$ on $[0, \infty)$) with a convex function is convex,
$\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $(\max\{u^i(a^i_j, p^{-i}) - u^i(p), 0\})^2$ is convex.
Therefore, $C_1(p) = \sum_{i \in I} \sum_{j=1}^{m_i} (\max\{u^i(a^i_j, p^{-i}) - u^i(p), 0\})^2$ is a convex function.

Similarly we can show that $C_2$ is also a convex function. ■


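It is easy to show that $f$, $C_1$ and $C_2$ are continuously differentiable on an open set
containing their respective domains (for a similar proof refer to [10]). Let $\nabla f(\pi, p) =
[\nabla_\pi f(\pi, p)^T \ \nabla_p f(\pi, p)^T]^T$, where $\nabla_\pi f(\pi, p) = (\frac{\partial f(\pi, p)}{\partial \pi^i(a^i_j)} : i \in I, 1 \le j \le m_i)$ and
$\nabla_p f(\pi, p) = (\frac{\partial f(\pi, p)}{\partial p(a)} : a \in A)$. For every $k \in I$, $\forall j \in \{1, \dots, m_k\}$,

$$\frac{\partial f(\pi, p)}{\partial \pi^k(a^k_j)} = \sum_{i \in I} \sum_{a \in A,\ a = (a^i_{j_i}, a^{-i})} \frac{\partial}{\partial \pi^k(a^k_j)} \Big( p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) \Big)^2$$

$$= \sum_{a^{-k} \in A^{-k}} \frac{\partial}{\partial \pi^k(a^k_j)} \Big( p(a^k_j, a^{-k}) - \pi^k(a^k_j) \sum_{l=1}^{m_k} p(a^k_l, a^{-k}) \Big)^2$$

$$= -2 \Big[ \sum_{a^{-k} \in A^{-k}} \Big( p(a^k_j, a^{-k}) - \pi^k(a^k_j) \sum_{l=1}^{m_k} p(a^k_l, a^{-k}) \Big) \sum_{l=1}^{m_k} p(a^k_l, a^{-k}) \Big].$$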
i=i i=i 


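So as to compute $\nabla_p f(\pi, p)$, we shall write $f(\pi, p) = \sum_{i \in I} \sum_{a \in A,\ a = (a^i_{j_i}, a^{-i})} (h^{i,a}(\pi)^T p)^2$, where
$h^{i,a}(\pi) \in \mathbb{R}^{M_2}$ is s.t. $\forall i \in I$, $\forall a \in A$, $p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i}) = h^{i,a}(\pi)^T p$ (which is
possible since $p(a) - \pi^i(a^i_{j_i}) \sum_{l=1}^{m_i} p(a^i_l, a^{-i})$ is linear in $p$). Therefore,

$$\nabla_p f(\pi, p) = \sum_{i \in I} \sum_{a \in A,\ a = (a^i_{j_i}, a^{-i})} \nabla_p (h^{i,a}(\pi)^T p)^2 = 2 \sum_{i \in I} \sum_{a \in A,\ a = (a^i_{j_i}, a^{-i})} (h^{i,a}(\pi)^T p) \, h^{i,a}(\pi).$$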
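
The closed form for the partial derivatives of $f$ with respect to $\pi$ can be sanity-checked against central finite differences. The sketch below is only an illustration; the dictionary encoding of $p$ and the names `graph_residual`, `grad_pi` and `finite_difference` are assumptions, and the entries of $\pi$ are treated as free coordinates.

```python
def graph_residual(pi, p):
    """f(pi, p) for a joint distribution p stored as {profile: probability}."""
    total = 0.0
    for a in p:
        for i, strategy in enumerate(pi):
            marginal = sum(p[a[:i] + (l,) + a[i + 1:]] for l in range(len(strategy)))
            total += (p[a] - strategy[a[i]] * marginal) ** 2
    return total

def grad_pi(pi, p):
    """Closed form: d f / d pi^k(a^k_j) =
    -2 * sum over a^{-k} of (p(a) - pi^k(a^k_j) * S) * S,
    where a = (a^k_j, a^{-k}) and S = sum_l p(a^k_l, a^{-k})."""
    grad = [[0.0] * len(strategy) for strategy in pi]
    for a in p:
        for k, strategy in enumerate(pi):
            S = sum(p[a[:k] + (l,) + a[k + 1:]] for l in range(len(strategy)))
            grad[k][a[k]] += -2.0 * (p[a] - strategy[a[k]] * S) * S
    return grad

def finite_difference(pi, p, k, j, eps=1e-5):
    """Central difference of f in the coordinate pi^k(a^k_j)."""
    up = [list(s) for s in pi]
    down = [list(s) for s in pi]
    up[k][j] += eps
    down[k][j] -= eps
    return (graph_residual(up, p) - graph_residual(down, p)) / (2.0 * eps)

pi = [[0.25, 0.75], [0.5, 0.5]]
p = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
analytic = grad_pi(pi, p)
numeric = [[finite_difference(pi, p, k, j) for j in range(2)] for k in range(2)]
```

Since $f$ is quadratic in each coordinate of $\pi$, the central difference agrees with the closed form up to rounding; for this example the derivative with respect to $\pi^1(a^1_1)$ is $-0.25$.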


The following lemma says that the set of partial optima of $f$, the set of stationary points
of $f$ and the set of global minima of $f$ are all the same.


Lemma 4.3: Given $(\pi^*, p^*) \in \Sigma \times \Sigma_c$. Then the following are equivalent.

(1) $(\pi^*, p^*)$ is a partial optimum of $f$.

(2) $(\pi^*, p^*)$ is s.t. $f(\pi^*, p^*) = 0$.

(3) $(\pi^*, p^*)$ is s.t. $\nabla f(\pi^*, p^*) = 0$.

Proof : [(1) $\Rightarrow$ (2)]. Since $(\pi^*, p^*)$ is a partial optimum of $f$, $\forall p \in \Sigma_c$, $f(\pi^*, p^*) \le
f(\pi^*, p)$. Hence, $0 \le f(\pi^*, p^*) \le f(\pi^*, P(\pi^*)) = 0$. Therefore, $f(\pi^*, p^*) = 0$.

[(2) $\Rightarrow$ (3)]. Since $f(\pi^*, p^*) = 0$, $\forall i \in I$, $\forall a \in A$, $p^*(a) - (\pi^*)^i(a^i_{j_i}) \sum_{l=1}^{m_i} p^*(a^i_l, a^{-i}) =
0$, where $a = (a^i_{j_i}, a^{-i})$. Substituting the above in the expressions of $\nabla_\pi f(\pi, p)$ and
$\nabla_p f(\pi, p)$ we get, $\nabla f(\pi^*, p^*) = 0$.

[(3) $\Rightarrow$ (1)]. Since $f$ is biconvex (from Lemma 4.1), $f(\cdot, p^*)$ and $f(\pi^*, \cdot)$ are convex
functions. From proposition 1.1.7 in [9], we get, $\forall \pi \in \Sigma$, $f(\pi, p^*) \ge f(\pi^*, p^*) +
\nabla_\pi f(\pi^*, p^*)^T (\pi - \pi^*)$ and $\forall p \in \Sigma_c$, $f(\pi^*, p) \ge f(\pi^*, p^*) + \nabla_p f(\pi^*, p^*)^T (p - p^*)$.
Substituting $\nabla f(\pi^*, p^*) = [\nabla_\pi f(\pi^*, p^*)^T \ \nabla_p f(\pi^*, p^*)^T]^T = 0$ gives, $\forall \pi \in \Sigma$, $f(\pi, p^*) \ge
f(\pi^*, p^*)$ and $\forall p \in \Sigma_c$, $f(\pi^*, p) \ge f(\pi^*, p^*)$. Thus, $(\pi^*, p^*)$ is a partial optimum of $f$. ■

So as to compute $\nabla_p C_1(p)$, we shall write $C_1(p) = \sum_{i \in I} \sum_{j=1}^{m_i} (\max\{(g^{i,j})^T p, 0\})^2$ where,
$\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $g^{i,j} \in \mathbb{R}^{M_2}$ is s.t. $(g^{i,j})^T p = u^i(a^i_j, p^{-i}) - u^i(p)$ (which is possible
since $u^i(a^i_j, p^{-i}) - u^i(p)$ is linear in $p$). Then $\nabla_p C_1(p) = 2 \sum_{i \in I} \sum_{j=1}^{m_i} (\max\{(g^{i,j})^T p, 0\}) \, g^{i,j}$.

The following lemma says that the set of global minima of $C_1$ and the set of stationary
points of $C_1$ are the same.

Lemma 4.4: Given $p^* \in \Sigma_c$. $C_1(p^*) = 0$ iff $\nabla_p C_1(p^*) = 0$.

Proof : Follows directly from the expression of the gradient and the convexity of $C_1$. ■


A similar result can be derived for $C_2$. In what follows, the results proved
for $C_1$ can be extended to $C_2$ as well.

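In Theorem 3.2 we showed that the set of zeros of $f(\pi, p) + C_1(p)$ is the same as the
set of Nash equilibrium profiles of the game $\Gamma$. In the following lemma we show that the
set of zeros of $f(\pi, p) + C_1(p)$ is the same as the set of stationary points of the function
$f(\pi, p) + C_1(p)$.


Lemma 4.5: Given $(\pi^*, p^*) \in \Sigma \times \Sigma_c$. $f(\pi^*, p^*) + C_1(p^*) = 0$ iff $\nabla(f(\pi^*, p^*) + C_1(p^*)) = 0$.

Proof : [$\Rightarrow$] Since $f(\pi^*, p^*) + C_1(p^*) = 0$ and $f$ and $C_1$ are non-negative,
$f(\pi^*, p^*) = 0$ and $C_1(p^*) = 0$. Thus, $\nabla f(\pi^*, p^*) = [\nabla_\pi f(\pi^*, p^*)^T \ \nabla_p f(\pi^*, p^*)^T]^T = 0$
and $\nabla_p C_1(p^*) = 0$ by Lemma 4.3 and Lemma 4.4 respectively. Therefore, $\nabla(f(\pi^*, p^*) + C_1(p^*)) =
[\nabla_\pi f(\pi^*, p^*)^T \ (\nabla_p f(\pi^*, p^*) + \nabla_p C_1(p^*))^T]^T = 0$.

[$\Leftarrow$] Since $\nabla(f(\pi^*, p^*) + C_1(p^*)) = [\nabla_\pi f(\pi^*, p^*)^T \ (\nabla_p f(\pi^*, p^*) + \nabla_p C_1(p^*))^T]^T = 0$,
we have $\nabla_p f(\pi^*, p^*) + \nabla_p C_1(p^*) = 0$. Hence $(\nabla_p f(\pi^*, p^*) + \nabla_p C_1(p^*))^T p^* = \nabla_p f(\pi^*, p^*)^T p^* +
\nabla_p C_1(p^*)^T p^* = 0$. By substituting the expressions for $\nabla_p f(\pi^*, p^*)$ and $\nabla_p C_1(p^*)$ we get,

$$\nabla_p f(\pi^*, p^*)^T p^* = \Big\{ 2 \sum_{i \in I} \sum_{a \in A} (h^{i,a}(\pi^*)^T p^*) \, h^{i,a}(\pi^*) \Big\}^T p^* = 2 \sum_{i \in I} \sum_{a \in A} (h^{i,a}(\pi^*)^T p^*)^2 = 2 f(\pi^*, p^*)$$

and

$$\nabla_p C_1(p^*)^T p^* = \Big\{ 2 \sum_{i \in I} \sum_{j=1}^{m_i} (\max\{(g^{i,j})^T p^*, 0\}) \, g^{i,j} \Big\}^T p^* = 2 \sum_{i \in I} \sum_{j=1}^{m_i} (\max\{(g^{i,j})^T p^*, 0\}) (g^{i,j})^T p^* = 2 \sum_{i \in I} \sum_{j=1}^{m_i} (\max\{(g^{i,j})^T p^*, 0\})^2 = 2 C_1(p^*).$$

Therefore, $0 = (\nabla_p f(\pi^*, p^*) + \nabla_p C_1(p^*))^T p^* =
\nabla_p f(\pi^*, p^*)^T p^* + \nabla_p C_1(p^*)^T p^* = 2 (f(\pi^*, p^*) + C_1(p^*))$. ■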
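
The last step of the proof can also be read as an instance of Euler's identity for positively homogeneous functions. The following remark is an added illustration, not part of the original argument, and the symbol $\phi$ is introduced here only for this purpose.

```latex
% For a fixed \pi^*, define \phi(p) := f(\pi^*, p) + C_1(p), extended to all p with the same formula.
% Each summand is the square of a term that is positively homogeneous of degree 1 in p, hence
\phi(t p) = \sum_{i \in I} \sum_{a \in A} \bigl( h^{i,a}(\pi^*)^{T} (t p) \bigr)^{2}
          + \sum_{i \in I} \sum_{j=1}^{m_i} \bigl( \max\{ (g^{i,j})^{T} (t p),\, 0 \} \bigr)^{2}
          = t^{2}\, \phi(p), \qquad t > 0.
% Differentiating both sides with respect to t at t = 1 gives
\nabla_p \phi(p)^{T} p = 2\, \phi(p),
% so \nabla_p \phi(p^*) = 0 forces \phi(p^*) = f(\pi^*, p^*) + C_1(p^*) = 0.
```

Read this way, the argument uses only the stationarity in $p$; the $\pi$-component of the gradient is not needed for the reverse implication.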


Lemma 4.5 shows that the function $f(\pi, p) + C_1(p)$ is invex. Similarly it can be shown
that $f(\pi, p) + C_2(p)$ is also invex.

In the following lemma we show that $B$ is a biconvex function. As a consequence of this
lemma, Lemma 4.1 and Lemma 3.3 in [7], we get that $f(\pi, p) + B(\pi, p)$ is a biconvex function.


Lemma 4.6: $B$ is a biconvex function, i.e. $\forall \pi \in \Sigma$, $B(\pi, \cdot) : \Sigma_c \to [0, \infty)$ is a convex
function and $\forall p \in \Sigma_c$, $B(\cdot, p) : \Sigma \to [0, \infty)$ is a convex function.

Proof : The proof is similar to that of Lemma 4.1. ■

5 Optimization problems.

In this section we shall state the optimization problems obtained using the functions
constructed in the previous sections such that the global minima of the optimization
problems correspond to Nash equilibria of the game $\Gamma$.

The first optimization problem (O.P.1) is stated below:

(O.P.1) :

    $\min_{(\pi, p)} f(\pi, p) + C_1(p)$

    subject to :

    $\pi^i(a^i_j) \ge 0$,    $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$,

    $p(a) \ge 0$,    $\forall a \in A$,

    $\sum_{j=1}^{m_i} \pi^i(a^i_j) = 1$,    $\forall i \in I$,

    $\sum_{a \in A} p(a) = 1$.

The constraints in the above optimization problem ensure that the feasible set is
$\Sigma \times \Sigma_c$. The second optimization problem (O.P.2) is stated below:

(O.P.2) :

    $\min_{(\pi, p)} f(\pi, p) + B(\pi, p)$

    subject to :

    $\pi^i(a^i_j) \ge 0$,    $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$,

    $p(a) \ge 0$,    $\forall a \in A$,

    $\sum_{j=1}^{m_i} \pi^i(a^i_j) = 1$,    $\forall i \in I$,

    $\sum_{a \in A} p(a) = 1$.

The following theorem says that the set of global minima of the optimization problem
(O.P.1) is the same as the set of Nash equilibrium profiles of the game $\Gamma$.

Theorem 5.1: For every game $\Gamma$, there exists $(\pi^*, p^*) \in \Sigma \times \Sigma_c$ s.t. $f(\pi^*, p^*) + C_1(p^*) =
0$. Further, given $(\pi^*, p^*) \in \Sigma \times \Sigma_c$, $f(\pi^*, p^*) + C_1(p^*) = 0$ iff $(\pi^*, p^*)$ is a Nash equilibrium
profile.

Proof : For every game there exists $\pi^* \in \Sigma$ s.t. $\pi^*$ is a N.E. (see [2]). Thus by
Theorem 3.2, $(\pi^*, p^*)$ with $p^* = P(\pi^*)$ satisfies $f(\pi^*, p^*) + C_1(p^*) = 0$. The other part
follows directly from Theorem 3.2. ■

A similar claim can be proved for O.P.2.

The above two optimization problems have a biconvex objective function with convex
(linear) constraints. Global optimization algorithms exist that solve the above two
optimization problems (see [6]).


6 The projected gradient descent algorithm and its 
convergence analysis. 

In this section we shall consider a projected gradient descent algorithm to solve O.P.1.
The algorithm is stated below:

Input:

• $\langle \pi_0, p_0 \rangle$ : initial point for the algorithm,

• $\Gamma$ : the underlying game,

• $\{a(n)\}_{n \ge 1}$ : step size sequence chosen as follows:

- $\forall n$, $a(n) > 0$,

- $\sum_{n=1}^{\infty} a(n) = \infty$,

- $\sum_{n=1}^{\infty} a(n)^2 < \infty$.

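• $\Pi(\cdot)$ : projection operator ensuring that $(\pi, p)$ remains in $\Sigma \times \Sigma_c$.

Output : After a sufficiently large number of iterations ($lim$) the algorithm outputs the
terminal strategy $(\pi^*, p^*)$.


The Algorithm :

$n \leftarrow 0$, the iteration index

while ($n < lim$)

    $\begin{pmatrix} \pi_{n+1} \\ p_{n+1} \end{pmatrix} \leftarrow \Pi\left( \begin{pmatrix} \pi_n \\ p_n \end{pmatrix} - a(n+1) \begin{pmatrix} \nabla_\pi f(\pi_n, p_n) \\ \nabla_p (f(\pi_n, p_n) + C_1(p_n)) \end{pmatrix} \right)$

    $n \leftarrow n + 1$

end while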


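In what follows we shall present the convergence analysis of the above projected gradient
descent algorithm. We shall analyse the behaviour of the above algorithm using the
O.D.E. method presented in [12]. In order to use the results from [12], we need the gradient
function to be Lipschitz continuous on $\Sigma \times \Sigma_c$, which is proved in the following lemma.

Lemma 6.1: There exists $L > 0$, s.t., $\forall (\pi_1, p_1), (\pi_2, p_2) \in \Sigma \times \Sigma_c$,

$$\left\| \begin{pmatrix} \nabla_\pi f(\pi_1, p_1) \\ \nabla_p (f(\pi_1, p_1) + C_1(p_1)) \end{pmatrix} - \begin{pmatrix} \nabla_\pi f(\pi_2, p_2) \\ \nabla_p (f(\pi_2, p_2) + C_1(p_2)) \end{pmatrix} \right\| \le L \left\| \begin{pmatrix} \pi_1 \\ p_1 \end{pmatrix} - \begin{pmatrix} \pi_2 \\ p_2 \end{pmatrix} \right\|.$$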

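Proof : It is easy to see that the function $f(\cdot)$ is twice continuously differentiable
on an open set containing $\Sigma \times \Sigma_c$. Thus $\nabla f(\cdot)$ is continuously differentiable on $\Sigma \times
\Sigma_c$. Since $\Sigma \times \Sigma_c$ is compact, $\|\nabla^2 f(\cdot)\| \le L_1$ on $\Sigma \times \Sigma_c$ for some $L_1 > 0$. By the mean value theorem, $\nabla f(\cdot)$
is Lipschitz continuous with Lipschitz constant $L_1$. Let $\alpha := \max_{(i,j) : i \in I, 1 \le j \le m_i} \|g^{i,j}\|$. Fix
$(\pi_1, p_1), (\pi_2, p_2) \in \Sigma \times \Sigma_c$. Clearly, $\forall i \in I$, $\forall j \in \{1, \dots, m_i\}$, $|\max\{(g^{i,j})^T p_1, 0\} -
\max\{(g^{i,j})^T p_2, 0\}| \le |(g^{i,j})^T (p_1 - p_2)|$. Therefore, using $\nabla_p C_1(p) = 2 \sum_{(i,j)} (\max\{(g^{i,j})^T p, 0\}) \, g^{i,j}$, we have,

$$\|\nabla C_1(p_1) - \nabla C_1(p_2)\| \le 2 \sum_{(i,j)} \|g^{i,j}\| \, |\max\{(g^{i,j})^T p_1, 0\} - \max\{(g^{i,j})^T p_2, 0\}|$$

$$\le 2 \sum_{(i,j)} \|g^{i,j}\| \, |(g^{i,j})^T (p_1 - p_2)|$$

$$\le 2 \sum_{(i,j)} \|g^{i,j}\|^2 \, \|p_1 - p_2\|$$

$$\le 2 \sum_{(i,j)} \alpha^2 \, \|p_1 - p_2\|$$

$$= 2 \alpha^2 \beta \, \|p_1 - p_2\|$$

where $\beta = |\bigcup_{i \in I} [\{i\} \times \{1, \dots, m_i\}]|$. Since $\|p_1 - p_2\| = \sqrt{\|p_1 - p_2\|^2} \le \sqrt{\|\pi_1 - \pi_2\|^2 + \|p_1 - p_2\|^2}$,
we have, $\|\nabla C_1(p_1) - \nabla C_1(p_2)\| \le L_2 \|(\pi_1, p_1) - (\pi_2, p_2)\|$, where $L_2 := 2 \alpha^2 \beta$. Since the sum
of two Lipschitz continuous functions is Lipschitz continuous, $\nabla(f(\cdot) + C_1(\cdot))$ is
Lipschitz continuous with Lipschitz constant $L := L_1 + L_2$. ■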


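In order to study the asymptotic behaviour of the recursion presented in the algorithm,
by results in Section 3.4 of [12], it is enough to study the asymptotic behaviour of the
o.d.e.,

$$\frac{d}{dt} \begin{pmatrix} \pi \\ p \end{pmatrix} = \gamma\left( \begin{pmatrix} \pi \\ p \end{pmatrix} ; - \begin{pmatrix} \nabla_\pi f(\pi, p) \\ \nabla_p (f(\pi, p) + C_1(p)) \end{pmatrix} \right) \qquad (1)$$

where $\forall v \in \Sigma \times \Sigma_c$, $\forall d \in \mathbb{R}^{M_1 + M_2}$, $\gamma(v; d) = \lim_{\delta \downarrow 0} \frac{\Pi(v + \delta d) - v}{\delta}$ is the directional derivative
of $\Pi(\cdot)$ at $v$ along the direction $d$. The above o.d.e. is well posed, i.e. it has a unique
solution for every initial point in $\Sigma \times \Sigma_c$ (for a proof see [12]).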

$\Sigma \times \Sigma_c$ is a Cartesian product of simplices and hence the projection of $(\pi, p) \in \mathbb{R}^{M_1 + M_2}$
onto $\Sigma \times \Sigma_c$ is the same as the projection of $\pi^i$ onto $\Sigma^i$, $\forall i \in I$, and of $p$ onto $\Sigma_c$, i.e.
$\Pi((\pi^T, p^T)^T) = [\Pi_{m_1}(\pi^1)^T, \dots, \Pi_{m_N}(\pi^N)^T, \Pi_{M_2}(p)^T]^T$ where, $\forall n \in \mathbb{N}$, $\Pi_n(\cdot)$ denotes
the projection operator which projects every vector in $\mathbb{R}^n$ onto $\Delta^n \subset \mathbb{R}^n$. Thus, in order
to compute the directional derivative of $\Pi(\cdot)$, it is enough to consider the directional
derivatives of the projection operators onto the individual simplices; juxtaposing
them gives the directional derivative of $\Pi(\cdot)$.

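The computation of the directional derivative of a projection operation onto a simplex
can be found in [12], which we shall state here. $\forall v \in \Delta^n$, $\forall d \in \mathbb{R}^n$, let $\gamma_n(v; d) :=
\lim_{\delta \downarrow 0} \frac{\Pi_n(v + \delta d) - v}{\delta}$ and $\varrho(v) := \{x \in \mathbb{R}^n : \|x\| = 1, \langle x, v - \bar{v} \rangle \le 0, \forall \bar{v} \in \Delta^n\}$. Then,

$$\gamma_n(v; d) = d + (\max\{\langle d, -x_n \rangle, 0\}) \, x_n \qquad (2)$$

where $x_n \in \varrho(v)$ is s.t. $\forall x \in \varrho(v)$, $\langle d, -x_n \rangle \ge \langle d, -x \rangle$.

Let, $\forall (\pi, p) \in \Sigma \times \Sigma_c$, $V(\pi, p) := f(\pi, p) + C_1(p)$. Fix $(\pi_0, p_0) \in \Sigma \times \Sigma_c$ to be an initial
point of the o.d.e. (1) and let the corresponding unique solution be $(\pi(t), p(t))$. Then,

$$\frac{dV(\pi(t), p(t))}{dt} = \nabla V(\pi(t), p(t))^T \, \gamma\left( \begin{pmatrix} \pi \\ p \end{pmatrix} ; - \begin{pmatrix} \nabla_\pi f(\pi, p) \\ \nabla_p (f(\pi, p) + C_1(p)) \end{pmatrix} \right)$$

$$= \sum_{i \in I} \nabla_{\pi^i} V(\pi(t), p(t))^T \gamma_{m_i}(\pi^i ; -\nabla_{\pi^i}(f(\pi, p) + C_1(p))) + \nabla_p V(\pi(t), p(t))^T \gamma_{M_2}(p ; -\nabla_p (f(\pi, p) + C_1(p))).$$

By substituting (2) and the fact that, $\forall (\pi, p) \in \Sigma \times \Sigma_c$, $\nabla V(\pi, p) = \nabla(f(\pi, p) + C_1(p))$
in the above equation we get,

$$\frac{dV(\pi(t), p(t))}{dt} \le \sum_{i \in I} \big( -\|\nabla_{\pi^i} f(\pi, p)\|^2 + |\langle \nabla_{\pi^i} f(\pi, p), x_{m_i} \rangle|^2 \big) + \big( -\|\nabla_p (f(\pi, p) + C_1(p))\|^2 + |\langle \nabla_p (f(\pi, p) + C_1(p)), x_{M_2} \rangle|^2 \big) \le 0,$$

where the last inequality follows from the Cauchy-Schwarz inequality and the fact
that $\forall n \in \mathbb{N}$, $\|x_n\| = 1$.

Therefore along every solution of the o.d.e. the value of the potential function $V(\cdot)$
decreases and hence the above o.d.e. converges to an internally chain transitive invariant
set contained in $\mathcal{L} := \{(\pi^*, p^*) \in \Sigma \times \Sigma_c : \frac{dV(\pi^*, p^*)}{dt} = 0\}$.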

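In the following lemma we shall prove that every $(\pi^*, p^*) \in \mathcal{L}$ is an equilibrium point of
the o.d.e. (1).


Lemma 6.2: If $(\pi^*, p^*) \in \mathcal{L}$, then,

$$\gamma\left( \begin{pmatrix} \pi^* \\ p^* \end{pmatrix} ; - \begin{pmatrix} \nabla_\pi f(\pi^*, p^*) \\ \nabla_p (f(\pi^*, p^*) + C_1(p^*)) \end{pmatrix} \right) = 0.$$

Proof : If $(\pi^*, p^*) \in \mathcal{L}$ is such that $\nabla(f(\pi^*, p^*) + C_1(p^*)) = 0$, then $\gamma((\pi^*, p^*); -\nabla(f(\pi^*, p^*) +
C_1(p^*))) = 0$. Assume $\nabla(f(\pi^*, p^*) + C_1(p^*)) \ne 0$. Since $(\pi^*, p^*) \in \mathcal{L}$,
$\sum_{i \in I} (-\|\nabla_{\pi^i} f(\pi^*, p^*)\|^2 + |\langle \nabla_{\pi^i} f(\pi^*, p^*), x_{m_i} \rangle|^2) + (-\|\nabla_p (f(\pi^*, p^*) + C_1(p^*))\|^2 + |\langle \nabla_p (f(\pi^*, p^*) + C_1(p^*)), x_{M_2} \rangle|^2) = 0$.
By the Cauchy-Schwarz inequality, $\forall i \in I$, $(-\|\nabla_{\pi^i} f(\pi^*, p^*)\|^2 + |\langle \nabla_{\pi^i} f(\pi^*, p^*), x_{m_i} \rangle|^2) \le 0$
and $(-\|\nabla_p (f(\pi^*, p^*) + C_1(p^*))\|^2 + |\langle \nabla_p (f(\pi^*, p^*) + C_1(p^*)), x_{M_2} \rangle|^2) \le 0$. Since their sum
is zero, we get, $\forall i \in I$, $(-\|\nabla_{\pi^i} f(\pi^*, p^*)\|^2 + |\langle \nabla_{\pi^i} f(\pi^*, p^*), x_{m_i} \rangle|^2) = 0$ and $(-\|\nabla_p (f(\pi^*, p^*) +
C_1(p^*))\|^2 + |\langle \nabla_p (f(\pi^*, p^*) + C_1(p^*)), x_{M_2} \rangle|^2) = 0$. Hence, $\forall i \in I$, $x_{m_i} = \pm \frac{\nabla_{\pi^i} f(\pi^*, p^*)}{\|\nabla_{\pi^i} f(\pi^*, p^*)\|}$ and
$x_{M_2} = \pm \frac{\nabla_p (f(\pi^*, p^*) + C_1(p^*))}{\|\nabla_p (f(\pi^*, p^*) + C_1(p^*))\|}$. By the definition of $x_n$ in equation (2) we get, $\forall i \in I$, $x_{m_i} =
\frac{\nabla_{\pi^i} f(\pi^*, p^*)}{\|\nabla_{\pi^i} f(\pi^*, p^*)\|}$ and $x_{M_2} = \frac{\nabla_p (f(\pi^*, p^*) + C_1(p^*))}{\|\nabla_p (f(\pi^*, p^*) + C_1(p^*))\|}$. Substituting for $x_{m_i}$ and $x_{M_2}$ in the expressions
for $\gamma_{m_i}((\pi^*)^i; -\nabla_{\pi^i}(f(\pi^*, p^*) + C_1(p^*)))$ and $\gamma_{M_2}(p^*; -\nabla_p (f(\pi^*, p^*) + C_1(p^*)))$ and
using the fact that $\gamma((\pi^*, p^*); -\nabla(f(\pi^*, p^*) + C_1(p^*))) = [(\gamma_{m_1}((\pi^*)^1; -\nabla_{\pi^1}(f(\pi^*, p^*) +
C_1(p^*))))^T, \dots, (\gamma_{m_N}((\pi^*)^N; -\nabla_{\pi^N}(f(\pi^*, p^*) + C_1(p^*))))^T, (\gamma_{M_2}(p^*; -\nabla_p (f(\pi^*, p^*) + C_1(p^*))))^T]^T$
we get the desired result. ■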

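In fact the converse is also true and the proof is similar to that of the previous lemma.
Therefore $\mathcal{L} = \mathcal{E}$ where $\mathcal{E}$ denotes the set of equilibrium points of the o.d.e. (1). The following
lemma says that every point in the set $\mathcal{L}$ is a partial optimum of the biconvex function
$f(\pi, p) + C_1(p)$.

Lemma 6.3: If $(\pi^*, p^*) \in \mathcal{L}$, then, $\forall \pi \in \Sigma$, $f(\pi^*, p^*) + C_1(p^*) \le f(\pi, p^*) + C_1(p^*)$ and
$\forall p \in \Sigma_c$, $f(\pi^*, p^*) + C_1(p^*) \le f(\pi^*, p) + C_1(p)$.

Proof : If $(\pi^*, p^*) \in \mathcal{L}$ is such that $\nabla(f(\pi^*, p^*) + C_1(p^*)) = 0$, then by Lemma 4.5
the result follows. Assume $\nabla(f(\pi^*, p^*) + C_1(p^*)) \ne 0$. Then by Lemma 6.2 we have,
$\forall i \in I$, $x_{m_i} = \frac{\nabla_{\pi^i} f(\pi^*, p^*)}{\|\nabla_{\pi^i} f(\pi^*, p^*)\|}$ and $x_{M_2} = \frac{\nabla_p (f(\pi^*, p^*) + C_1(p^*))}{\|\nabla_p (f(\pi^*, p^*) + C_1(p^*))\|}$.

By equation (2), $x_{M_2} \in \varrho(p^*)$ and hence $\forall p \in \Sigma_c$, $\langle \frac{\nabla_p (f(\pi^*, p^*) + C_1(p^*))}{\|\nabla_p (f(\pi^*, p^*) + C_1(p^*))\|}, p^* - p \rangle \le 0$.
Therefore $\forall p \in \Sigma_c$, $\langle \nabla_p (f(\pi^*, p^*) + C_1(p^*)), p - p^* \rangle \ge 0$. By convexity of $f(\pi^*, \cdot) + C_1(\cdot)$
and proposition 1.1.8 in [9], we get $\forall p \in \Sigma_c$, $f(\pi^*, p^*) + C_1(p^*) \le f(\pi^*, p) + C_1(p)$.

By equation (2), $\forall i \in I$, $x_{m_i} \in \varrho((\pi^*)^i)$ and hence $\forall i \in I$, $\forall \pi^i \in \Sigma^i$, $\langle \frac{\nabla_{\pi^i} f(\pi^*, p^*)}{\|\nabla_{\pi^i} f(\pi^*, p^*)\|}, (\pi^*)^i -
\pi^i \rangle \le 0$. Therefore $\forall i \in I$, $\forall \pi^i \in \Sigma^i$, $\langle \nabla_{\pi^i} f(\pi^*, p^*), \pi^i - (\pi^*)^i \rangle \ge 0$. Since $\forall \pi \in
\Sigma$, $\langle \nabla_\pi (f(\pi^*, p^*) + C_1(p^*)), \pi - \pi^* \rangle = \langle \nabla_\pi f(\pi^*, p^*), \pi - \pi^* \rangle = \sum_{i \in I} \langle \nabla_{\pi^i} f(\pi^*, p^*), \pi^i - (\pi^*)^i \rangle$,
we get, $\forall \pi \in \Sigma$, $\langle \nabla_\pi (f(\pi^*, p^*) + C_1(p^*)), \pi - \pi^* \rangle \ge 0$. Thus by convexity of $f(\cdot, p^*) + C_1(p^*)$
and by proposition 1.1.8 in [9], we have, $\forall \pi \in \Sigma$, $f(\pi^*, p^*) + C_1(p^*) \le f(\pi, p^*) + C_1(p^*)$. ■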


Even though the proof only guarantees convergence to the set of partial optima of the
biconvex function, in simulations on various test cases it was observed that the iterates
converge to the set of Nash equilibria of the game $\Gamma$.

7 Simulation results. 

In the simulations carried out, in order to perform the projection operation in every 
iteration we use the procedure in HH 

7.1 Rock-Paper-Scissor : 

We consider the following version of the standard rock-paper-scissor game.

             R        P        S
    R      (0,0)    (0,1)    (1,0)
    P      (1,0)    (0,0)    (0,1)
    S      (0,1)    (1,0)    (0,0)

In the above game, ((1/3, 1/3, 1/3), (1/3, 1/3, 1/3)) is the only Nash equilibrium strategy. Having
started the algorithm from a random initial point, variation of the objective function
value and the strategies are shown in the plots below.

The plots in Fig. 1 show that the action probabilities converge to the Nash equilibrium
of the game. As the action probabilities converge to the Nash equilibrium strategy, the
objective function value approaches zero, as seen in Fig. 2.

Figure 1: Action probabilities vs iteration index. (a) Action probabilities of player 1 vs iteration index. (b) Action probabilities of player 2 vs iteration index.

Figure 2: Objective function value vs iteration index.


7.2 Jordan’s game : 

The general form of Jordan's game can be found in [13]. We consider the following
version.

Player 3 action $a^3_1$:

                 $a^2_1$      $a^2_2$
    $a^1_1$     (0,0,0)     (1,1,0)
    $a^1_2$     (1,0,1)     (0,1,1)

Player 3 action $a^3_2$:

                 $a^2_1$      $a^2_2$
    $a^1_1$     (0,1,1)     (1,0,1)
    $a^1_2$     (1,1,0)     (0,0,0)

In the above game, ((1/2, 1/2), (1/2, 1/2), (1/2, 1/2)) is the only Nash equilibrium strategy. Having
started the algorithm from a random initial point, variation of the objective function
value and the strategies are shown in the plots in Fig. 3 and Fig. 4.

Figure 3: Action probabilities vs iteration index. (a) Action probabilities of player 1 vs iteration index. (b) Action probabilities of player 2 vs iteration index.

Figure 4: Action probabilities/Objective function value vs iteration index. (a) Action probabilities of player 3 vs iteration index. (b) Objective function value vs iteration index.


Simulations were also carried out on other versions of this game obtained from the 
general form in [13] and convergence to Nash equilibrium was observed. 

7.3 A game with finite number of Nash equilibria : 

The following game was introduced in [13] in order to show non-convergence of a certain
class of algorithms. The game is stated below.

                 $a^2_1$    $a^2_2$    $a^2_3$
    $a^1_1$      (1,0)     (0,1)     (1,0)
    $a^1_2$      (0,1)     (1,0)     (1,0)
    $a^1_3$      (0,1)     (0,1)     (1,1)

In the above game, ((1/2, 1/2, 0), (1/2, 1/2, 0)) and ((0, 0, 1), (0, 0, 1)) are the two Nash
equilibrium strategies. Having started the algorithm from a random initial point, variation of
the objective function value and the strategies are shown in the plots in Fig. 5 and Fig. 6.

Figure 5: Action probabilities vs iteration index. (a) Action probabilities of player 1 vs iteration index. (b) Action probabilities of player 2 vs iteration index.

Figure 6: Objective function value vs iteration index.


7.4 A game with infinite Nash equilibria : 



                 $a^2_1$     $a^2_2$
    $a^1_1$      (3,0)      (12,0)
    $a^1_2$      (3,-2)     (2,-5)

In the above game, $\{((\alpha, 1-\alpha), (1, 0)) : 0 \le \alpha \le 1\} \cup \{((1, 0), (\alpha, 1-\alpha)) : 0 \le \alpha \le 1\}$
is the set of Nash equilibria. Having started the algorithm from a random initial point,
variation of the objective function value and the strategies are shown in the plots in Fig. 7
and Fig. 8.

Figure 7: Action probabilities vs iteration index. (a) Action probabilities of player 1 vs iteration index. (b) Action probabilities of player 2 vs iteration index.

Figure 8: Objective function value vs iteration index.


8 Summary and directions for future work. 

We have presented optimization problems (O.P.1 and O.P.2) such that the global minima
of these optimization problems are Nash equilibria of the game $\Gamma$. The objective functions
were shown to be biconvex and in the case of O.P.1 the objective function was also shown to
be an invex function. We also considered a projected gradient descent scheme and proved
that it converges to a partial optimum of the objective function. Even though the proof
only guarantees convergence to the set of partial optima, in the various test cases considered we
have seen convergence to a Nash equilibrium strategy.

In future we wish to extend the above optimization problem formulation to discounted
stochastic games, and either prove convergence to a Nash equilibrium or construct a counterexample
where the algorithm converges to a partial optimum which is not a Nash equilibrium
strategy.